Performance of Machine Learning Algorithms with Different K Values in K-fold Cross-Validation
Abstract
The numerical value of k in a k-fold cross-validation training technique of machine learning predictive models is an essential element that impacts the model's performance. A right choice of k results in better accuracy, while a poorly chosen value for k might affect the model's performance. In the literature, the most commonly used values of k are five (5) or ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor from very high variance. However, there is no formal rule. To the best of our knowledge, few experimental studies have attempted to investigate the effect of diverse k values in training different machine learning models. This paper empirically analyses the prevalence and effect of distinct k values (3, 5, 7, 10, 15 and 20) on the validation performance of four well-known machine learning algorithms (Gradient Boosting Machine (GBM), Logistic Regression (LR), Decision Tree (DT) and K-Nearest Neighbours (KNN)). It was observed that the value of k and the model validation performance differ from one machine learning algorithm (MLA) to another for the same classification task. However, our empirical results suggest that k = 7 offers a slight increase in validation accuracy and in the area under the curve measure, with lesser computational complexity than k = 10, across most MLA. We discuss in detail the study outcomes and outline some guidelines for beginners in the machine learning field in selecting the best k value and machine learning algorithm for a given task.
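The kind of comparison the abstract describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' experimental setup: the data are synthetic two-class blobs, the only model is a small hand-written 3-nearest-neighbour classifier, and every function name (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`, `make_toy_data`) is hypothetical. It uses the standard library only.

```python
import random
import time


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, point, n_neighbours=3):
    """Majority vote among the n_neighbours closest training points."""
    order = sorted(
        range(len(train_X)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(train_X[j], point)),
    )
    votes = [train_y[j] for j in order[:n_neighbours]]
    return max(set(votes), key=votes.count)


def cross_val_accuracy(X, y, k, seed=0):
    """Mean validation accuracy over k folds (each fold is held out once)."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


def make_toy_data(n_samples=120, seed=1):
    """Two Gaussian blobs in 2-D; purely synthetic, not the paper's data."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        centre = 2.0 * label
        X.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    # The same k values the abstract lists.
    for k in (3, 5, 7, 10, 15, 20):
        start = time.perf_counter()
        acc = cross_val_accuracy(X, y, k)
        elapsed = time.perf_counter() - start
        print(f"k={k:2d}  mean accuracy={acc:.3f}  time={elapsed:.3f}s")
```

Because each of the k folds requires one model fit on the remaining data, the cost of the loop grows with k, which is the computational trade-off the abstract refers to. The numbers printed for this toy data say nothing about the paper's results.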
Similar resources
n-fold Commutative Hyper K-ideals
In this paper, we introduce the definitions of n-fold commutative and implicative hyper K-ideals. These definitions are generalizations of the definitions of commutative and implicative hyper K-ideals, respectively, which were defined in [12]. Then we obtain some related results. In particular, we determine the relationships between n-fold implicative hyper K-ideal and n-fol...
The 'K' in K-fold Cross Validation
The K-fold Cross Validation (KCV) technique is one of the most used approaches by practitioners for model selection and error estimation of classifiers. The KCV consists in splitting a dataset into k subsets; then, iteratively, some of them are used to learn the model, while the others are exploited to assess its performance. However, in spite of the KCV success, only practical rule-of-thumb me...
Machine learning algorithms in air quality modeling
Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to these issues. It is a fact that input variable type largely affec...
Multi-K Machine Learning Ensembles
Ensemble machine learning models often surpass single models in classification accuracy at the expense of higher computational requirements during training and execution. In this paper we present a novel ensemble algorithm called Multi-K which uses unsupervised clustering as a form of dataset preprocessing to create component models that lead to effective and efficient ensembles. We also presen...
Comparative Analysis of Machine Learning Algorithms with Optimization Purposes
The fields of optimization and machine learning increasingly interact, and optimization in different problems leads to the use of machine learning approaches. Machine learning algorithms work in reasonable computational time for specific classes of problems and have an important role in extracting knowledge from large amounts of data. In this paper, a methodology has been employed to opt...
Journal
Journal title: International Journal of Information Technology and Computer Science
Year: 2021
ISSN: 2074-9007, 2074-9015
DOI: https://doi.org/10.5815/ijitcs.2021.06.05